Abstract

Flow watermarks efficiently link packet flows in a network in order to thwart various attacks such as 
stepping stones. We study the problem of designing good flow watermarks. Earlier flow watermarking 
schemes mostly considered substitution errors, neglecting the effects of packet insertions and deletions 
that commonly happen within a network. More recent schemes consider packet deletions but often at 
the expense of the watermark visibility. We present an invisible flow watermarking scheme capable 
of enduring a large number of packet losses and insertions. To maintain invisibility, our scheme uses 
quantization index modulation (QIM) to embed the watermark into inter-packet delays, as opposed to 
time intervals including many packets. As the watermark is injected within individual packets, packet 
losses and insertions may lead to watermark desynchronization and substitution errors. To address this 
issue, we add a layer of error-correction coding to our scheme. Experimental results on both synthetic 
and real network traces demonstrate that our scheme is robust to network jitter, packet drops and splits, 
while remaining invisible to an attacker. 
I. Introduction

Detecting correlated network flows, also known as flow linking, is a technique for traffic analysis with 
wide applications in network security and privacy. For instance, it may be utilized to expose a stepping 
stone attacker who hides behind proxy hosts. Alternatively, flow linking has been successfully used to 
attack low-latency anonymity networks such as Tor [?], where anonymity is compromised once end flows
are correctly matched. As network connections are often encrypted, it is infeasible to link flows directly
by relying on packet contents. However, matching flows using side information such as packet timings is
possible, as their values remain to some extent unchanged even after encryption [?].

Earlier work in flow linking was based on long observation of flow characteristics, such as the number
of ON/OFF periods [?]. Such passive techniques are fragile vis-a-vis network artifacts and require long
observation periods to avoid large false alarm rates. Flow watermarking, an active approach, was suggested
as an improvement. In this approach, a pattern, the watermark, is injected into the flow with the hope that
the flow stays traceable after traversing the network as long as the same pattern can be later extracted [2],
[6]-[10]. Unlike passive schemes, flow watermarking is highly reliable and works effectively on short
flows.

The challenge of designing good flow watermarks is to keep the injected pattern robust to network
artifacts yet invisible to watermark attackers.¹ The robustness requirement guarantees that the injected
pattern survives network artifacts, while the invisibility property prevents watermark removal attempts
by active attackers. Most state-of-the-art schemes currently achieve one of the two properties at the
expense of the other. In the so-called interval-based schemes [2], [8], a flow is divided into intervals, and
all packets within selected intervals are shifted to form a watermark pattern. Given that a few packets
would not greatly affect the pattern created in the entire interval, these schemes are robust against network
artifacts such as packet drops and splits. However, shifting a large number of packets produces noticeable
'traces' of the embedded watermarks and compromises the invisibility requirement [11]. In
inter-packet-delay (IPD)-based schemes [6], [9], the delays between consecutive packets are modulated to embed
watermarks. Since only small perturbations are introduced in each inter-arrival time, watermarks are not
visible. The drawback of this approach is that any packet loss or insertion during transmission can cause
watermark desynchronization and severe decoding errors.

¹The goal of watermark attackers is to prevent the success of flow linking by disrupting the detection or altogether removing
the watermarks from the flow.

In this paper, we present a new IPD-based flow watermarking scheme where invisible watermark
patterns are injected in the inter-arrival times of successive packets. We treat the network as a channel
with substitution, deletion, and bursty insertion errors caused by jitter, packet drops, and packet splits or
retransmission, respectively, and introduce an insertion, deletion and substitution (IDS) error-correction
coding scheme to communicate the watermark reliably over the channel. At the same time, we preserve
watermark invisibility by making unnoticeable modifications to packet timings using the QIM
framework [12]. Through experiments on both synthetic and real network traces, we show that our scheme
performs reliably in the presence of network jitter, packet losses and insertions. Furthermore, we verify the
watermark invisibility using Kolmogorov-Smirnov [13] and multi-flow-attack tests [11].

The rest of the paper is organized as follows. Background on flow watermarking appears in §II. We
describe notations and definitions in §III. Our proposed scheme is presented in §IV. We evaluate the
performance of our scheme using synthetic and real traffic traces in §V.

II. Background 

This section covers some background material on flow watermarking. First, we describe three application
scenarios of flow watermarking. Second, we discuss some principles for designing good watermarking
schemes. We conclude by surveying the literature.

A. Applications 

We begin with a stepping-stone detection scenario where flow watermarks are used to find hidden
network attackers. Figure 1 depicts an attacker Bob who wants to attack a victim Alice without exposing
his identity. Bob first remotely logs in to a compromised intermediate host Charlie via SSH [14]. Then he
proceeds by sending attack flows to Alice from Charlie's machine. Tracing packet flows sent to Alice's
machine would implicate Charlie instead of Bob as the attacker. Hosts like Charlie, exploited to hide
the real attack source, are called stepping stones [?]. In real life, attackers may hide behind a chain of
stepping stones, making it hard for the victim, who only sees the last hop, to determine the origin of the
attack. Fortunately, flow watermarking is a solution for tracing the attack source. Notice that an interactive
connection is maintained along Bob-Charlie-Alice during the above stepping stone attack. Hence Alice
can secretly embed a watermark in the packet flow heading back to Charlie. As this flow travels back to
Bob, the watermark could be subsequently detected by the intermediate routers (or firewalls), revealing
the attack path and its true origin [15], [16].

Fig. 1. Detecting attackers behind stepping stones. Bob uses Charlie as a stepping stone to attack Alice so that his identity
remains hidden from Alice. To trace back the origin of this attack, Alice injects a watermark on the flow sent back to the stepping
stone. The path leading to Bob is exposed as every router along this path detects Alice's watermark on flows passing through.

Fig. 2. (a) Stepping stones in enterprise networks. An intruder compromises a host in the enterprise network as a 'stepping stone'.
The enterprise embeds watermarks on all incoming flows and monitors all the outgoing flows. Any pair of incoming/outgoing
flows with the same watermark indicates the existence of inside stepping stones. (b) An anonymity network. Incoming flows
are shuffled before leaving the system to hide the pairing among communicating parties.

Another scenario of stepping-stone attack occurs in enterprise networks, as shown in Figure 2(a). Here,
intruders try to compromise hosts in an enterprise network to relay their malicious traffic [?], [?].
To discover this kind of 'stepping stone' within the network, an enterprise can add watermarks on all
incoming flows, and then terminate outgoing flows that contain the watermark since they most probably
come from stepping stones. In a similar fashion, flow watermarking may be applied to attacking anonymity
network systems [14], [?]-[20]. In order to hide the identities of communicating parties, an anonymity
network shuffles all the flows passing through it, as shown in Figure 2(b). If an attacker somehow
discovers the hidden mappings between incoming and outgoing flows, the anonymity is compromised.
Akin to the previous enterprise network scenario, this can be achieved by marking all incoming flows
with watermarks and subsequently detecting the watermarks on the exiting flows.
B. Design Principles

From the above application examples, we summarize a list of principles for designing flow watermarks.
The challenge of building an efficient scheme lies in the difficulty of achieving all desired properties 
simultaneously. 

• Robustness. One major advantage of flow watermarking over passive traffic analysis is the robustness
against network noise. Take the stepping stone attack of Figure 1 for example. The flow Alice sends
back to Bob is subjected to jitter, packet drops, and packet splits during transmission. All these
artifacts can alter the watermark, resulting in decoding errors. Without the ability to withstand these
artifacts, flow watermarking is no different from passive analysis, which is fragile by nature.

• Invisibility. A successful watermark pattern should stay 'invisible' to avoid possible attacks. For
instance, in Figure 2(a), if the intruder notices that incoming flows contain watermarks, it can
command the stepping stone to take precautionary actions (for instance, remove the watermarks
altogether).

• Blindness. In a blind watermarking scheme, the watermark pattern can be extracted without the help
of the original flow [21]. On the contrary, the original flow must be present in order to detect
non-blind watermarks. Again, consider the example of Figure 2(a). In order to detect the stepping stone,
the enterprise needs to perform watermark decoding on all outgoing flows. If a non-blind detection
scheme is used, all exit routers are required to obtain a copy of each incoming flow. The resulting
overheads of bandwidth and storage make such schemes impractical in large enterprise networks.

• Presence watermarking. In conventional digital watermarking (e.g., multimedia watermarking), often
a large amount of hiding capacity is desired as the injected watermarks are frequently used for
copyright protection among many users [22]. This, fortunately, is not required for most flow watermarking
applications, since the main purpose of injecting watermarks here is to link flows initiated from the
same sources. In other words, in digital watermarking terminology, zero-bit or presence watermarks
suffice [21]. Therefore, when designing a flow watermarking scheme, one may trade the capacity
for other properties such as robustness (see the discussion in §IV-B).

C. Related Work 

We briefly review previous flow watermarking literature. To the best of our knowledge, all the previous
schemes fail to meet at least one of the above design principles, necessitating the development of a
comprehensive approach that meets all the aforementioned criteria.

TABLE I

Summary of current watermarking schemes

Scheme                   Invisibility   Robustness   Blindness
Interval-based [2]       no             yes          yes
Interval-based [8]       no             yes          yes
IPD-based [6]            yes            no           yes
IPD-based [9]            yes            yes          no
The proposed scheme      yes            yes          yes

Earlier flow watermarks are of the inter-packet-delay (IPD)-based type. In [6], the authors propose an
IPD-based scheme that modulates the mean of selected IPDs using the QIM framework. Watermark
synchronization is lost if enough packets are dropped or split. Therefore the scheme is unreliable. Another
IPD-based scheme is presented in [9], where watermarks are added by enlarging or shrinking the IPDs.
This non-blind scheme achieves some watermark resynchronization when packets are dropped or split,
but is not scalable as the original packet flow is required during decoding.

In interval-based schemes, instead of using the IPDs between individual packets, the watermark pattern
is encoded into batch packet characteristics within fixed time intervals. In [2], an interval-centroid scheme
is proposed. After dividing the flow into time intervals of the same length, the authors create two patterns
by manipulating the centroid of packets within each interval. The modified centroids are not easily changed
even after packets are delayed, lost or split. A similar design is presented in [8], where the watermark
pattern is embedded in the packet densities of predefined time intervals. One problem with interval-based
schemes is the lack of invisibility. Moving packets in batches generates visible artifacts, which can expose
the watermark positions. Based on this observation, a multi-flow attack (MFA) was proposed in [11].
The authors showed that by lining up as few as 10 watermarked flows, an attacker can observe a number
of large gaps between packets (see Figure 10 in [11]) in the aggregate flow, revealing the watermark
positions.

Table I compares existing flow watermarking schemes with our proposed scheme. Unlike previous
work, the new scheme satisfies all the desired properties.
III. Notations and Definitions

In the rest of the paper, we use the following notation: $a^b = \{a_1, a_2, \ldots, a_b\}$ is a
sequence of length $b$; $a_r^t = \{a_r, \ldots, a_t\}$ is a subsequence of $a^b$ starting with index $r$ and ending with $t$.
In particular, if $t < r$, $a_r^t$ is an empty sequence, denoted by $\emptyset$; $\oplus$ denotes the 'xor' operation.

We also define the following variables used in our scheme.

• $I^N$ is the IPD sequence of an original packet flow, where each delay, $I_i$, is positive real valued;

• $I^{w,N}$ is the IPD sequence of the same flow after injection of the watermark pattern;

• $\hat{I}^{N'}$ is the IPD sequence received by the watermark decoder;

• $w^n$ is the binary watermark sequence;

• $\mathbf{w}^N$ is a sparse version of $w^n$, where $N = sn$;

• $s$ is the sparsification factor and is integer valued;

• $f$ is the density of $\mathbf{w}^N$ (see (2));

• $k^N$ is a pseudo-random binary key sequence;

• $x^N$ is a binary sequence, generated from the watermark $w^n$ and the key $k^N$, and embedded into
flow IPDs;

• $y^{N'}$ is the decoder's estimate of $x^N$;

• $\hat{w}^n$ is the estimate of the watermark sequence at the decoder;

• $\Delta$ is a real-valued step size used for IPD quantization. It represents the strength of the watermark
signal;

• $\sigma$ is the standard deviation of jitter;

• $P_s$, $P_I$, and $P_d$ represent the probability of a substitution, an insertion, and a deletion event in the
communication channel model of the network, respectively.

IV. The Proposed Scheme 

A. Overview of the System 

Figure 3 depicts the schematic of our proposed scheme, which can be divided into two layers:
the insertion deletion substitution (IDS) encoder/decoder and the quantization index modulation (QIM)
encoder/decoder. In the upper layer, the watermark sequence $w^n$ is processed to generate an IDS
error-correction codeword $x^N$. On the lower layer, a QIM framework is used to inject $x^N$ into the IPDs of
the flow. QIM embedding is blind and causes little change to packet timings [12]. Upon receiving the
flow, the QIM decoder extracts the pattern $y^{N'}$. Subsequently an IDS decoder recovers the watermark,
$\hat{w}^n$, from this pattern.

Fig. 3. An overview of the proposed flow watermarking scheme. The watermark sequence $w^n$ is first transformed into a
codeword $x^N$ with the help of the key $k^N$. $x^N$ is then embedded into flow IPDs using QIM. At the decoder, the IPDs are
processed by a QIM decoder to extract the codeword $y^{N'}$, from which the IDS decoder subsequently recovers the watermark $\hat{w}^n$.

If we abstract the QIM encoder, the network, and the QIM decoder together as a channel, which takes
$x^N$ as the input and outputs $y^{N'}$, flow watermarking is equivalent to solving the problem of sending
one bit of information (the presence of the watermark) over this compound communication channel (see
Figure 4). Codes for this compound channel must withstand dependent substitution, deletion, and bursty
insertion errors. We next introduce each component of our scheme in detail.

Fig. 4. Abstraction of the communication channel. The IDS encoder/decoder pair helps correct the dependent substitution, deletion,
and bursty insertion errors on the channel.

B. Insertion Deletion Substitution (IDS) Encoder 

Our IDS error correction scheme is inspired by [23], [24], where a 'marker' code is employed to provide
reliable communications over a channel with deletion and insertion errors. However, the approach in [23],
[24] is not directly applicable to our channel, as we need to deal with somewhat more complicated errors,
such as dependent substitution, deletion, and bursty insertions, which we discuss in §IV-C2.
The IDS encoder works as follows. The watermark sequence $w^n$ is first sparsified into a longer sequence
$\mathbf{w}^N$ of length $N = sn$, as given by

$$ \mathbf{w}_{(j-1)s+1}^{js} = S(w_j), \quad j = 1, 2, \ldots, n, \tag{1} $$

where $S(\cdot)$ is a deterministic sparsification function that pads $w_j$ with zeros, and is known at the decoder.
We denote by density $f$ the ratio of '1's in $\mathbf{w}^N$, i.e.,

$$ f = \frac{1}{N} \sum_{i=1}^{N} \mathbf{w}_i. \tag{2} $$

$f$ is a decoding parameter shared with the IDS decoder. The sparsified watermark is then added to
a key $k^N$ to form the codeword $x^N$:

$$ x_i = \mathbf{w}_i \oplus k_i, \quad i = 1, 2, \ldots, N, \tag{3} $$

where $k^N$ is a pseudo-random key sequence which is also known at the decoder.

Let us work through a small example of embedding one bit of watermark $w_1 = 1$ in a length-8 sequence.
First $w_1$ is sparsified into an 8-bit sequence $\mathbf{w}^8 = 10000000$ (the sparsification factor $s = 8$). Then
we add this sparse sequence to the first 8 bits of our key, $k^8 = 11111011$. The resulting codeword is
$x^8 = 01111011$. Because $x^8$ differs from the key at only one position, the decoder can infer the
positions of deleted or inserted bits by comparing the received codeword with the key. For instance, if the
decoder receives a codeword $y^7 = 0111011$, one bit shorter than the key, then it knows that most likely
a bit '1' from the second run was lost during transmission. Based on this observation, a probabilistic
decoder can be developed to fully recover embedded bits, as will be discussed in §IV-D.
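The encoding step in this example can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes that the sparsification function places the watermark bit first and pads it with s - 1 zeros, which matches the sequence 10000000 in the example but is not stated in general.

```python
def sparsify(watermark_bits, s):
    """Expand each watermark bit into a block of s bits.

    Assumption: the bit comes first and is followed by s - 1 zeros.
    """
    sparse = []
    for bit in watermark_bits:
        sparse.append(bit)
        sparse.extend([0] * (s - 1))
    return sparse


def density(sparse_bits):
    """Fraction of '1' bits in the sparse sequence."""
    return sum(sparse_bits) / len(sparse_bits)


def ids_encode(watermark_bits, key_bits, s):
    """XOR the sparsified watermark with the key to form the codeword."""
    sparse = sparsify(watermark_bits, s)
    if len(key_bits) < len(sparse):
        raise ValueError("key must be at least as long as the sparse watermark")
    return [w ^ k for w, k in zip(sparse, key_bits)]


# Values taken from the example in the text.
key = [1, 1, 1, 1, 1, 0, 1, 1]
codeword = ids_encode([1], key, s=8)
print(codeword)  # [0, 1, 1, 1, 1, 0, 1, 1]
```

Because the codeword differs from the key only where the sparse sequence holds a '1', a larger s keeps the two closer together, at the price of fewer watermark bits per packet.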

Since $\mathbf{w}^N$ is sparse, the codeword $x^N$ is close to the key, which is known at the IDS decoder. Therefore,
the IDS encoding helps resynchronize the lost/inserted bits at the cost of information capacity over the
channel, which is not a concern for flow watermarking (see §II-B).

C. Insertion Deletion Substitution (IDS) Channel 

1) QIM Embedding: The codeword $x^N$ is injected into the IPDs of the original flow using QIM embedding.
Given a quantization step size $\Delta$, the QIM encoder changes the IPD, $I_i$, into an even (or odd) multiple
of $\Delta/2$ given that the embedded bit $x_i$ is 0 (or 1). The IPDs after modification are given by

$$ I_i^w = \begin{cases} \left\lceil \dfrac{\max\left(\sum_{j=1}^{i} I_j - \sum_{j=1}^{i-1} I_j^w,\ 0\right)}{\Delta} \right\rceil \Delta & \text{if } x_i = 0, \\[2ex] \left( \left\lceil \dfrac{\max\left(\sum_{j=1}^{i} I_j - \sum_{j=1}^{i-1} I_j^w,\ 0\right)}{\Delta} \right\rceil + 0.5 \right) \Delta & \text{if } x_i = 1, \end{cases} \tag{4} $$

for $i = 1, 2, \ldots, N$, where the ceiling function describes the operation that adds the minimum delay to
Packet $i$ to form the desired multiple of $\Delta/2$.

Fig. 5. An example of substitution errors caused by network jitter. 'x's denote even quantizers and 'o's odd quantizers. The
bit embedded in $I_1$ is '1', but the decoded bit from $\hat{I}_1$ (delay between received packets 0 and 1) is '0'.

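At the QIM decoder, each embedded bit is extracted based on whether a received IPD is closer to an
even or odd quantizer, i.e.,

$$ y_i = \begin{cases} \left\lfloor \dfrac{2\hat{I}_i}{\Delta} \right\rfloor \bmod 2 & \text{if } \dfrac{2\hat{I}_i}{\Delta} - \left\lfloor \dfrac{2\hat{I}_i}{\Delta} \right\rfloor < 0.5, \\[2ex] \left\lceil \dfrac{2\hat{I}_i}{\Delta} \right\rceil \bmod 2 & \text{if } \dfrac{2\hat{I}_i}{\Delta} - \left\lfloor \dfrac{2\hat{I}_i}{\Delta} \right\rfloor \ge 0.5. \end{cases} \tag{5} $$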
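The QIM layer can be illustrated with a minimal Python sketch. It rests on stated assumptions rather than on the authors' implementation: quantizers sit at multiples of Δ/2, the encoder rounds the required delay up to the next multiple of Δ and adds Δ/2 for a '1' bit, packets can only be delayed, and the decoder picks the nearest quantizer. The function names are hypothetical.

```python
import math


def qim_embed(ipds, bits, delta):
    """Return watermarked IPDs, delaying packets so each IPD hits a quantizer.

    Even multiples of delta/2 carry a 0, odd multiples carry a 1.
    """
    marked = []
    orig_time = 0.0    # arrival time of the current packet in the original flow
    marked_time = 0.0  # send time of the previous packet in the marked flow
    for ipd, bit in zip(ipds, bits):
        orig_time += ipd
        # A packet cannot be sent before it arrives, so the gap is never negative.
        gap = max(orig_time - marked_time, 0.0)
        new_ipd = (math.ceil(gap / delta) + 0.5 * bit) * delta
        marked.append(new_ipd)
        marked_time += new_ipd
    return marked


def qim_decode(ipds, delta):
    """Decode each IPD as the parity of its nearest multiple of delta/2."""
    return [int(math.floor(2.0 * ipd / delta + 0.5)) % 2 for ipd in ipds]


bits = [1, 0, 1]
marked = qim_embed([130.0, 40.0, 250.0], bits, delta=100.0)
print(marked)                     # [250.0, 0.0, 250.0]
print(qim_decode(marked, 100.0))  # [1, 0, 1]
```

With these assumptions a received IPD decodes correctly as long as it stays within Δ/4 of the quantizer it was placed on.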

2) Channel Model: In the presence of network artifacts, the received IPDs, $\hat{I}^{N'}$, are different from the
IPDs $I^{w,N}$ originally sent, leading to errors in decoding $x^N$. Substitution errors occur when network jitter alters IPDs
substantially. Figure 5 depicts one example where an embedded bit is flipped by jitter. In Figure 5, the bit
'$x_1 = 1$' was originally encoded in the IPD $I_1$, placing $I_1^w$ on an odd multiple of $\Delta/2$. But at the QIM decoder, the received
IPD $\hat{I}_1$ is pushed by jitter into the decision region of a neighboring even quantizer, and thus decoded as '$y_1 = 0$'. In the absence of packet drops
or splits, a watermark bit flips if the IPD jitter is larger than $\Delta/4$. Following the observation of previous
work that IPD jitter (within a certain period of time) is approximately i.i.d. zero-mean Laplace
distributed [?], the probability of a substitution error by jitter can be estimated as

$$ P_s \approx 1 - \int_{-\Delta/4}^{\Delta/4} F(x; \sigma^2)\, dx, \tag{6} $$

where $F(\cdot)$ is the Laplacian pdf and $\sigma^2$ is its variance.

Fig. 6. Merging of IPDs as the result of packet drops. The deletion of Packet 1 merges the first two IPDs $I_1$ and $I_2$, and the
deletions of Packets 3 and 4 merge $I_3$, $I_4$ and $I_5$.
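For a zero-mean Laplace distribution the tail probability has a closed form, so the substitution probability can be evaluated directly. The sketch below is a hedged illustration: it assumes a bit flips whenever the jitter magnitude exceeds Δ/4 and ignores the case where very large jitter lands back on a quantizer of the same parity, so it slightly overestimates the flip rate. The function name is hypothetical.

```python
import math


def substitution_prob(delta, sigma):
    """P(|jitter| > delta / 4) for zero-mean Laplace jitter with std sigma."""
    # A Laplace distribution with scale b has variance 2 * b**2.
    scale = sigma / math.sqrt(2.0)
    # For Laplace(0, b), P(|X| > t) = exp(-t / b).
    return math.exp(-(delta / 4.0) / scale)


# Step sizes and jitter level used later in the evaluation (in ms).
for delta in (20.0, 60.0, 100.0):
    print(delta, substitution_prob(delta, sigma=10.0))
```

Under this approximation a small step size relative to the jitter gives a flip probability close to one half, while a larger step size drives it down exponentially.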

Decoding errors also occur when packets are dropped. As packet drops lead to the merger of successive
IPDs, the resulting error contains both deletion and substitution, which we refer to as a dependent deletion
and substitution error. For instance, in Figure 6, the deletion of Packet 1 merges the IPDs $I_1$ and $I_2$ into a
large received IPD $\hat{I}_1$. As a result, instead of $x_1$ and $x_2$, only one bit $x_1 \oplus x_2$ is received at the decoder.
We consider this case as a deletion of $x_1$, and possibly a substitution of $x_2$. In this paper, we assume
that each packet is dropped independently with probability $P_d$. For the convenience of analysis, we also
assume that the head of the watermarked packet sequence, Packet 0, is not dropped.

The last type of error comes from packet insertions. This happens when packets are split to meet a
smaller packet size limit, or when TCP retransmission is triggered by network congestion. Both cases
cause bursty insertions of packets. An example of such a scenario is depicted in Figure 7. Packet 2 is
split into three smaller ones, creating two new IPDs (2-2' and 2'-2'', both with zero length). Therefore,
two extra '0' bits would be decoded in $y^{N'}$. In general, newly generated packets are mostly right next
to the original one, hence we consider all inserted bits to be '0's.² Furthermore, we assume the number of
inserted packets follows a geometric distribution with parameter $P_I$.

²Our methodology can be extended to cover the case that both '0' and '1' bits may be inserted.

Fig. 7. A scenario with packet insertions. Packet 1 is split into two packets, and Packet 2 is split into three pieces.
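The three error types can be combined into a small bit-level simulation of the channel. The following Python sketch is an illustration under the assumptions described in the text (independent drops, a geometric number of insertions after each packet, inserted bits equal to '0', and a possible flip of the first bit decoded after a merger). The function name and the order of the random draws are choices made here, not taken from the paper.

```python
import random


def ids_channel(x, p_d, p_i, p_s, rng):
    """Pass codeword bits x through a simulated IDS channel and return y."""
    y = []
    acc = 0  # XOR of the bits on IPDs merged since the last received packet
    for bit in x:
        acc ^= bit
        dropped = rng.random() < p_d
        inserted = 0
        while rng.random() < p_i:  # geometric number of inserted packets
            inserted += 1
        arrivals = inserted + (0 if dropped else 1)
        if arrivals == 0:
            continue  # nothing arrived: this IPD merges into the next one
        flip = 1 if rng.random() < p_s else 0
        y.append(acc ^ flip)             # first new bit: merged bits, maybe flipped
        y.extend([0] * (arrivals - 1))   # remaining zero-length IPDs decode as 0
        acc = 0
    return y


rng = random.Random(7)
print(ids_channel([0, 1, 1, 1, 1, 0, 1, 1], p_d=0.1, p_i=0.05, p_s=0.03, rng=rng))
```

Setting all three probabilities to zero returns the input unchanged, which is a convenient sanity check before feeding the output to a decoder.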

D. Insertion Deletion Substitution (IDS) Decoder 

We estimate each watermark bit from $y^{N'}$ using the maximum likelihood decoding rule given by

$$ \hat{w}_j = \arg\max_{w_j \in \{0, 1\}} P\left(y^{N'} \mid w_j\right), \quad j = 1, 2, \ldots, n. \tag{7} $$

Since $x^N$ is a deterministic function of $w^n$, we derive the likelihood in (7) based on the dependency
between $y^{N'}$ and $x^N$ over the IDS channel. Suppose the QIM decoder received $i' - 1$ packets after the
first $i - 1$ packets were sent out by the QIM encoder, and assume the $(i' - 1)$th packet in the received flow
corresponds to the $q$th packet in the sent flow or a packet inserted immediately after it ($q \le i - 1$). The
possible outcomes after Packet $i$ is sent are:

• if Packet $i$ in the sent flow is lost and no packets are inserted, the QIM decoder cannot decode
new bits;

• if Packet $i$ is lost but $l > 0$ new packets are inserted right after it, the decoder could decode $l$ bits,
$y_{i'}^{i'+l-1}$, from newly received IPDs;

• if Packet $i$ is received and additionally $l \ge 0$ packets are inserted, the decoder can decode $l + 1$
new bits, $y_{i'}^{i'+l}$.

In the last two cases, the new IPD between the $(i' - 1)$th and $i'$th packets in the received flow
corresponds to the merger of all IPDs between Packet $q$ and Packet $i$ in the sent flow. Hence, the first
new bit, $y_{i'}$, is given by

$$ y_{i'} = \begin{cases} x_{q+1} \oplus x_{q+2} \oplus \cdots \oplus x_i & \text{w. p. } 1 - P_s, \\ x_{q+1} \oplus x_{q+2} \oplus \cdots \oplus x_i \oplus 1 & \text{w. p. } P_s, \end{cases} \tag{8} $$

where $P_s$ is the probability of a substitution error given in (6). The remaining new bits ($y_{i'+1}^{i'+l-1}$ or
$y_{i'+1}^{i'+l}$) are just '0' bits resulting from bursty packet insertions.

1) Hidden Markov Model: To capture the evolution of newly decoded bits from the received flow, we
define the state after sending each packet by the pair $(x'_i, d_i)$, for $i = 1, 2, \ldots, N$, where

• The accumulated bit $x'_i$ is the sum of all bits resulting from the merger of the IPDs between Packet $i$
and the previous packet that was received at the decoder. If Packet $i - 1$ was received, then
$x'_i$ is just the bit embedded on the IPD between Packet $i$ and $i - 1$, i.e., $x_i$. On the other hand, if
Packet $i - 1$ was completely lost (i.e., after its deletion, there were no insertions), $x'_i$ would be the
sum of the current bit $x_i$ and the bits embedded on previously merged IPDs, i.e., $x_i \oplus x'_{i-1}$. To sum up,

$$ x'_i = \begin{cases} x_i & \text{w. p. } 1 - P_d(1 - P_I), \\ x_i \oplus x'_{i-1} & \text{w. p. } P_d(1 - P_I). \end{cases} \tag{9} $$

Recall from (3) that $x^N$ is generated using the key and the sparse watermark sequence $\mathbf{w}^N$. We will
model the sparse watermark bits $\mathbf{w}_i$ as independent Bernoulli($f$) random variables. Therefore (9)
can be rewritten as

$$ x'_i = \begin{cases} k_i & \text{w. p. } (1 - f)\left(1 - P_d(1 - P_I)\right), \\ k_i \oplus 1 & \text{w. p. } f\left(1 - P_d(1 - P_I)\right), \\ k_i \oplus x'_{i-1} & \text{w. p. } (1 - f) P_d (1 - P_I), \\ k_i \oplus x'_{i-1} \oplus 1 & \text{w. p. } f P_d (1 - P_I). \end{cases} \tag{10} $$

Note that from (9), we can rewrite (8) as

$$ y_{i'} = \begin{cases} x'_i & \text{w. p. } 1 - P_s, \\ x'_i \oplus 1 & \text{w. p. } P_s. \end{cases} \tag{11} $$

• The drift $d_i$ is the shift in position of the sent Packet $i$ in the received flow, i.e., if Packet $i$ was not
lost, it would appear at position $i' = i + d_i$ in the received flow. Given $d_{i-1}$, the drift of Packet $i$
is updated as

$$ d_i = \begin{cases} d_{i-1} - 1 & \text{w. p. } P_d(1 - P_I), \\ d_{i-1} + l,\ l \ge 0 & \text{w. p. } P_d P_I^{l+1}(1 - P_I) + (1 - P_d) P_I^{l}(1 - P_I), \end{cases} \tag{12} $$

where the first case occurs when Packet $i - 1$ was dropped with no new packets inserted, and the
second case occurs when a total of $l + 1$ packets are received, either because Packet $i - 1$ was dropped
and there were $l + 1$ insertions or because Packet $i - 1$ was received and there were $l$ insertions.

Fig. 8. The hidden Markov model for the IDS channel. The observations are the codewords ($y_i$'s) received by the IDS decoder,
and the hidden states keep track of the drift and accumulated bit when sending every packet.
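A quick numerical check of the drift update is sketched below in Python. It assumes the geometric insertion model stated earlier, in which l insertions occur with probability P_I^l (1 - P_I), and verifies that the probabilities of all possible drift changes add up to one. The function name is hypothetical.

```python
def drift_step_prob(step, p_d, p_i):
    """Probability that the drift changes by `step` between consecutive packets."""
    if step < -1:
        return 0.0  # at most one packet can vanish per step
    if step == -1:
        # Packet dropped and nothing inserted after it.
        return p_d * (1.0 - p_i)
    # Dropped with step + 1 insertions, or received with step insertions.
    return (p_d * p_i ** (step + 1) + (1.0 - p_d) * p_i ** step) * (1.0 - p_i)


total = sum(drift_step_prob(step, p_d=0.1, p_i=0.2) for step in range(-1, 200))
print(abs(total - 1.0) < 1e-12)  # True
```

The sum telescopes analytically as well: P_d(1 - P_I) + P_d P_I + (1 - P_d) = 1.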
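Combining (11) and (10), and given $i' = i + d_i$, we have

$$ y_{i+d_i} = \begin{cases} k_i & \text{w. p. } \left((1 - f)(1 - P_s) + f P_s\right)\left(1 - P_d(1 - P_I)\right), \\ k_i \oplus 1 & \text{w. p. } \left(f(1 - P_s) + (1 - f) P_s\right)\left(1 - P_d(1 - P_I)\right), \\ x'_{i-1} \oplus k_i & \text{w. p. } \left((1 - f)(1 - P_s) + f P_s\right) P_d (1 - P_I), \\ x'_{i-1} \oplus k_i \oplus 1 & \text{w. p. } \left(f(1 - P_s) + (1 - f) P_s\right) P_d (1 - P_I). \end{cases} \tag{13} $$

Equation (13) captures the HMM with hidden states $(x'_i, d_i)$, $i = 1, 2, \ldots, N$, and observation states
$y^{N'}$, as depicted in Figure 8. The state transition probabilities $P\left(y_{i-1+d_{i-1}}^{i-1+d_i}, x'_i, d_i \mid x'_{i-1}, d_{i-1}\right)$ can be
derived using (10), (12) and (13), summarized as

$$ P\left(y_{i-1+d_{i-1}}^{i-1+d_i}, x'_i, d_i \mid x'_{i-1}, d_{i-1}\right) = \begin{cases} (1 - f) P_d (1 - P_I) & \text{if } x'_i = x'_{i-1} \oplus k_i,\ d_i = d_{i-1} - 1 \text{ and } y_{i-1+d_{i-1}}^{i-1+d_i} = \emptyset, \\ f P_d (1 - P_I) & \text{if } x'_i = x'_{i-1} \oplus k_i \oplus 1,\ d_i = d_{i-1} - 1 \text{ and } y_{i-1+d_{i-1}}^{i-1+d_i} = \emptyset, \\ (1 - f)(1 - P_s)(1 - P_I)\left(P_d P_I^{l+1} + (1 - P_d) P_I^{l}\right) & \text{if } x'_i = k_i,\ d_i = d_{i-1} + l \text{ and } y_{i-1+d_{i-1}} = x'_{i-1}, \\ f (1 - P_s)(1 - P_I)\left(P_d P_I^{l+1} + (1 - P_d) P_I^{l}\right) & \text{if } x'_i = k_i \oplus 1,\ d_i = d_{i-1} + l \text{ and } y_{i-1+d_{i-1}} = x'_{i-1}, \\ (1 - f) P_s (1 - P_I)\left(P_d P_I^{l+1} + (1 - P_d) P_I^{l}\right) & \text{if } x'_i = k_i,\ d_i = d_{i-1} + l \text{ and } y_{i-1+d_{i-1}} = x'_{i-1} \oplus 1, \\ f P_s (1 - P_I)\left(P_d P_I^{l+1} + (1 - P_d) P_I^{l}\right) & \text{if } x'_i = k_i \oplus 1,\ d_i = d_{i-1} + l \text{ and } y_{i-1+d_{i-1}} = x'_{i-1} \oplus 1. \end{cases} \tag{14} $$

For example, after sending Packet $i - 1$, the system state is $(x'_{i-1}, d_{i-1})$. If Packet $i - 1$ is lost and
no packets are inserted, then from (12), the drift of Packet $i$ becomes $d_i = d_{i-1} - 1$, and no new bit
is decoded, i.e., $y_{i-1+d_{i-1}}^{i-1+d_i}$ is an empty sequence. Additionally, the IPD between Packet $i$ and $i - 1$ is
added to previously merged IPDs such that $x'_i$ is decided based on the last two cases in (10). Overall,
the transition probability in this scenario is given by

$$ P\left(\emptyset, x'_i, d_{i-1} - 1 \mid x'_{i-1}, d_{i-1}\right) = \begin{cases} (1 - f) P_d (1 - P_I) & \text{if } x'_i = x'_{i-1} \oplus k_i, \\ f P_d (1 - P_I) & \text{if } x'_i = x'_{i-1} \oplus k_i \oplus 1. \end{cases} \tag{15} $$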
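The transition probabilities can be written as a single function, which makes it easy to confirm that they form a proper distribution. The Python sketch below is a reading of the six cases described above, not the authors' code: the argument names are hypothetical, the emitted bits are passed as a list, and all bits after the first are required to be '0', following the assumption that inserted packets decode as zeros.

```python
def transition_prob(emitted, x_new, d_new, x_prev, d_prev, k_i, f, p_s, p_d, p_i):
    """Joint probability of the emitted bits and the next state (x_new, d_new)."""
    step = d_new - d_prev
    if step < -1 or len(emitted) != step + 1:
        return 0.0
    if step == -1:
        # Packet lost, nothing inserted: the accumulated bit keeps growing.
        w = x_new ^ x_prev ^ k_i  # sparse watermark bit implied by the states
        return (f if w else 1.0 - f) * p_d * (1.0 - p_i)
    if any(emitted[1:]):
        return 0.0  # inserted packets are assumed to decode as '0'
    w = x_new ^ k_i  # accumulation restarts, so x_new = w XOR k_i
    flipped = emitted[0] ^ x_prev
    arrival = (1.0 - p_i) * (p_d * p_i ** (step + 1) + (1.0 - p_d) * p_i ** step)
    return (f if w else 1.0 - f) * (p_s if flipped else 1.0 - p_s) * arrival


# Sum over every outcome reachable from one previous state.
params = dict(x_prev=1, d_prev=0, k_i=1, f=0.1, p_s=0.03, p_d=0.1, p_i=0.2)
total = 0.0
for step in range(-1, 60):
    for x_new in (0, 1):
        if step == -1:
            total += transition_prob([], x_new, step, **params)
        else:
            for first in (0, 1):
                total += transition_prob([first] + [0] * step, x_new, step, **params)
print(abs(total - 1.0) < 1e-9)  # True
```

A function of this shape is the only model-specific ingredient that a generic forward-backward routine over the states would need.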

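2) Forward-Backward Algorithm: For the HMM in Figure 8, we apply the forward-backward algorithm
to derive the likelihoods $P\left(y^{N'} \mid w_j\right)$, $j = 1, 2, \ldots, n$. Let us define the forward quantity as the
joint probability of the bits $y_1^{i-1+d_i}$ decoded before sending Packet $i$ and the hidden state $(x'_i, d_i)$, which
is given by

$$ F_i(x'_i, d_i) = P\left(y_1^{i-1+d_i}, x'_i, d_i\right), \quad i = 1, 2, \ldots, N. \tag{16} $$

The forward quantities can be computed recursively using the transition probabilities in (14) as

$$ F_i(x'_i, d_i) = \sum_{x'_{i-1},\, d_{i-1}} F_{i-1}(x'_{i-1}, d_{i-1})\, P\left(y_{i-1+d_{i-1}}^{i-1+d_i}, x'_i, d_i \mid x'_{i-1}, d_{i-1}\right). \tag{17} $$

Similarly, we define the backward quantity as the conditional probability of decoding the rest of the
bits in the received flow, given the current state $(x'_i, d_i)$,

$$ B_i(x'_i, d_i) = P\left(y_{i+d_i}^{N'} \mid x'_i, d_i\right), \quad i = 1, 2, \ldots, N. \tag{18} $$

The backward quantities can also be computed recursively as

$$ B_i(x'_i, d_i) = \sum_{x'_{i+1},\, d_{i+1}} P\left(y_{i+d_i}^{i+d_{i+1}}, x'_{i+1}, d_{i+1} \mid x'_i, d_i\right) B_{i+1}(x'_{i+1}, d_{i+1}). \tag{19} $$

Given the forward/backward quantities, the posterior likelihood of the watermark bit $w_j$ is given by

$$ P\left(y^{N'} \mid w_j\right) = P\left(y^{N'} \mid \mathbf{w}_{(j-1)s+1}^{js}\right) = \sum_{x'_{(j-1)s},\, d_{(j-1)s},\, x'_{js},\, d_{js}} F_{(j-1)s}\left(x'_{(j-1)s}, d_{(j-1)s}\right) F'_{js}\left(x'_{js}, d_{js}\right) B_{js}\left(x'_{js}, d_{js}\right), \tag{20} $$

where the first equality follows from our watermark sparsification function in (1), and the quantity
$F'_{js}(x'_i, d_i)$ is defined as

$$ F'_{js}(x'_i, d_i) = P\left(y_{(j-1)s+d_{(j-1)s}}^{i-1+d_i}, x'_i, d_i \mid x'_{(j-1)s}, d_{(j-1)s}, \mathbf{w}_{(j-1)s+1}^{js}\right), \quad (j - 1)s + 1 \le i \le js. \tag{21} $$

The quantity $F'_{js}(x'_i, d_i)$ can be calculated recursively as

$$ F'_{js}(x'_i, d_i) = \sum_{x'_{i-1},\, d_{i-1}} F'_{js}(x'_{i-1}, d_{i-1})\, P\left(y_{i-1+d_{i-1}}^{i-1+d_i}, x'_i, d_i \mid x'_{i-1}, d_{i-1}, \mathbf{w}_{(j-1)s+1}^{js}\right), \tag{22} $$

where $P\left(y_{i-1+d_{i-1}}^{i-1+d_i}, x'_i, d_i \mid x'_{i-1}, d_{i-1}, \mathbf{w}_{(j-1)s+1}^{js}\right)$ is given by

$$ P\left(y_{i-1+d_{i-1}}^{i-1+d_i}, x'_i, d_i \mid x'_{i-1}, d_{i-1}, \mathbf{w}_{(j-1)s+1}^{js}\right) = \begin{cases} P_d (1 - P_I) & \text{if } d_i = d_{i-1} - 1,\ x'_i = \mathbf{w}_i \oplus k_i \oplus x'_{i-1} \text{ and } y_{i-1+d_{i-1}}^{i-1+d_i} = \emptyset, \\ P_s (1 - P_I)\left(P_d P_I^{d_i - d_{i-1} + 1} + (1 - P_d) P_I^{d_i - d_{i-1}}\right) & \text{if } d_i \ge d_{i-1},\ x'_i = \mathbf{w}_i \oplus k_i \text{ and } y_{i-1+d_{i-1}} = x'_{i-1} \oplus 1, \\ (1 - P_s)(1 - P_I)\left(P_d P_I^{d_i - d_{i-1} + 1} + (1 - P_d) P_I^{d_i - d_{i-1}}\right) & \text{if } d_i \ge d_{i-1},\ x'_i = \mathbf{w}_i \oplus k_i \text{ and } y_{i-1+d_{i-1}} = x'_{i-1}. \end{cases} \tag{23} $$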

Once the posterior probabilities for all watermark bits are calculated, the watermark sequence, $\hat{w}^n$, can
be estimated using the maximum likelihood rule of (7). Finally, the presence of the watermark in a flow
is decided based on the correlation value of the estimated watermark, $\hat{w}^n$, and the original watermark
sequence, $w^n$.

TABLE II

True positive rates with varying watermark parameters when false positive rate is fixed below 1%.

n \ Δ (ms)     20        60        100
10             0.0310    0.6050    0.6224
30             0.0310    0.9790    0.9970
50             0.0272    0.9990    1

TABLE III

True positive rates under varying IPD jitter with false positive rates below 1%.

Jitter Std. Dev. (ms)    10       20       30       40
Synthetic traffic        1.000    0.989    0.770    0.232
Real traffic             1.000    0.989    0.652    0.193
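The final decision step can be sketched as follows. The text does not specify how the correlation is normalized or which threshold is used, so both are assumptions here: bits are mapped to ±1, the correlation is averaged over the watermark length, and the threshold is left as a free parameter. The function names are hypothetical.

```python
def correlation(estimate, watermark):
    """Normalized correlation in [-1, 1] after mapping bits {0, 1} to {-1, +1}."""
    if not watermark or len(estimate) != len(watermark):
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum((2 * a - 1) * (2 * b - 1) for a, b in zip(estimate, watermark))
    return total / len(watermark)


def is_watermarked(estimate, watermark, threshold):
    """Declare the watermark present when the correlation reaches the threshold."""
    return correlation(estimate, watermark) >= threshold


watermark = [1, 0, 1, 1]
print(correlation([1, 0, 0, 1], watermark))            # 0.5
print(is_watermarked([1, 0, 0, 1], watermark, 0.4))    # True
```

With this mapping an unrelated flow gives a correlation near zero on average, so the threshold trades false positives against missed detections.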

V. Evaluation 

We tested our scheme on two groups of traces: synthetic packet flows of length 2000 generated from a
Poisson process with an average rate of 3.3 packets per second (pps), and real SSH traces of length 2000
collected in the CAIDA database with an average rate of 0.865 pps [25], which represent typical traffic in
human-involved network connections, where flow watermarks are most applicable.
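A synthetic flow of this kind can be generated with the standard library alone, since the inter-packet delays of a Poisson process are independent exponential variables with mean 1/rate. The sketch below is illustrative: the function name and the seed are arbitrary, and whether "length 2000" counts packets or IPDs is not stated in the text, so the example simply draws 2000 delays.

```python
import random


def poisson_ipds(num_ipds, rate_pps, rng):
    """Draw inter-packet delays (in seconds) of a Poisson process."""
    return [rng.expovariate(rate_pps) for _ in range(num_ipds)]


ipds = poisson_ipds(2000, rate_pps=3.3, rng=random.Random(1))
print(len(ipds), sum(ipds) / len(ipds))  # 2000 delays, mean close to 1 / 3.3
```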

A. Parameter Selection 

The first test examined the effects of watermark length $n$ and IPD quantization step size $\Delta$. We varied
$n$ over {10, 30, 50}, $\Delta$ over {20, 60, 100} ms and fixed the sparsification factor $s = 10$. The deletion
and insertion probabilities and the network jitter were set to $P_d = 0.1$, $P_I = 0$, $\sigma = 10$ ms, respectively.
5000 synthetic flows were embedded with watermarks and another 5000 unwatermarked ones served as
the control group.

Table II shows the true positive rates of our test, when false positive rates were kept under 1%. As we
increase the watermark length or quantization step size (embed a 'stronger' pattern), the detection error decreases.
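One common way to report a true positive rate at a bounded false positive rate is to pick the detection threshold from the scores of the unwatermarked control group. The text does not describe how the threshold was chosen, so the Python sketch below is an assumed procedure with hypothetical function names: it selects a threshold that at most a given fraction of control scores exceed, then measures how many watermarked scores exceed it.

```python
def threshold_for_fpr(null_scores, max_fpr):
    """Return a threshold that at most max_fpr of the null scores exceed."""
    ranked = sorted(null_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # number of tolerated false positives
    if allowed >= len(ranked):
        return float("-inf")
    # Scores must be strictly above this value to count as detections.
    return ranked[allowed]


def rate_above(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(score > threshold for score in scores) / len(scores)


control = [i / 100.0 for i in range(100)]  # toy scores of unwatermarked flows
marked = [0.97, 0.995, 1.2, 1.5]           # toy scores of watermarked flows
t = threshold_for_fpr(control, max_fpr=0.01)
print(rate_above(control, t) <= 0.01, rate_above(marked, t))
```

The toy scores are placeholders. In an experiment like the one described, the control scores would come from the 5000 unwatermarked flows.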

For the remaining tests in this section, we fix the watermark parameters to {$\Delta$ = 100 ms, $n$ = 50, $s$ = 10}, which
had the best performance in Table II.
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TABLE IV 

True positive rates for varying Pa with false positive rates below 1%. 



Pd 


1% 


2% 


3% 


10% 


Synthetic traffic 
Real traffic 


1.000 
1.000 


1.000 
1.000 


1.000 
1.000 


0.995 
0.996 



TABLE V 

True positive rates for varing Pi with false positive rates below 1%. 



Pi 


1% 


5% 


10% 


20% 


Synthetic traffic 
Real traffic 


1.000 
1.000 


1.000 
1.000 


1.000 
1.000 


0.500 
0.568 



TABLE VI 

True positive rates for varying PuPa with false positive rates below 1%. 



PuPd 


1%,1% 


5%,5% 


10%,10% 


Synthetic 
Real 


1.000 
1.000 


1.000 
1.000 


0.764 
0.662 



B. Robustness Tests 

We evaluated watermark robustness against network jitter, and packet loss and insertion. 

1) IPD jitter: We tested IPD jitter with standard deviation $\sigma$ varied over {10, 20, 30, 40} ms. The
packet drop and split probabilities were $P_d = 0.1$ and $P_i = 0$, respectively. This time, we watermarked
1000 flows from both synthetic and SSH traces.

The true positive rates are given in Table III. Notice that the watermarks were detected with accuracies
over 98%, even when the jitter standard deviation was as high as 20 ms. The detection performance falls sharply as the jitter
standard deviation rises to 30 ms and 40 ms. However, such excessively large jitter rarely occurs under normal network
conditions. Hence, our scheme withstands network jitter in normal operating conditions.

2) Packet deletion and insertion: One major improvement of our design over previous work is robustness
against packet deletion and insertion. To verify this, we tested our scheme in a network
with: solely packet deletion with probabilities $P_d \in \{0.01, 0.02, 0.03, 0.1\}$, solely packet insertion with
probabilities $P_i \in \{0.01, 0.05, 0.1, 0.2\}$, and both deletion and insertion with probabilities $(P_d, P_i) \in
\{(0.01, 0.01), (0.05, 0.05), (0.1, 0.1), (0.15, 0.15)\}$. During all the tests, the standard deviation of jitter
was fixed at $\sigma = 10$ ms, and 1000 flows from both synthetic and SSH traces were used.

TABLE VII

Average KS distances between watermarked and unwatermarked synthetic traces.

n \ $\Delta$ (ms)    100       80        60
30                   0.0177    0.0138    0.0101
40                   0.0233    0.0181    0.0133
50                   0.0284    0.0223    0.0160

TABLE VIII

Average KS distances between watermarked and unwatermarked SSH traces.

n \ $\Delta$ (ms)    100       80        60
30                   0.0091    0.0081    0.0071
40                   0.0120    0.0111    0.0091
50                   0.0158    0.0139    0.0123

The results in Tables IV-VI demonstrate that watermarks were detected with high accuracy even when 5% of
packets were dropped and 5% were inserted.
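The robustness tests perturb each flow with jitter, deletions, and insertions. The sketch below shows one simple way to apply such perturbations to a list of IPDs. It is not the paper's channel model: the Gaussian jitter, the merging of IPDs on deletion, and the uniform split point on insertion are all assumptions made for illustration.

```python
import random


def simulate_channel(ipds_ms, p_del=0.1, p_ins=0.0, sigma_ms=10.0, seed=None):
    """Apply jitter, deletions, and insertions to a list of IPDs (ms).

    Assumptions of this sketch (not taken from the paper):
      * jitter is zero-mean Gaussian on each IPD, clipped at 0;
      * a deleted packet merges its IPD into the next surviving one;
      * an inserted packet splits an IPD at a uniformly random point.
    """
    rng = random.Random(seed)
    out, carry = [], 0.0
    for ipd in ipds_ms:
        ipd = max(0.0, ipd + rng.gauss(0.0, sigma_ms)) + carry
        carry = 0.0
        if rng.random() < p_del:
            carry = ipd  # packet lost: its delay is absorbed by the next IPD
            continue
        if rng.random() < p_ins:
            cut = rng.uniform(0.0, ipd)
            out.extend([cut, ipd - cut])  # extra packet splits the interval
        else:
            out.append(ipd)
    return out
```

Because a deletion or insertion changes the number of IPDs, every later IPD shifts position relative to the embedded sequence, which is the desynchronization the decoder has to handle.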

C. Visibility Tests 

We evaluated watermark invisibility with two tests: the Kolmogorov-Smirnov (KS) test and the
multiflow attack (MFA) test.

The KS test is commonly used to compare the distributions of data sets. Given two data sets, the KS distance
is computed as the maximum difference of their empirical distribution functions ifTSll . For two flows A and 
B, the KS distance is given by $\sup_x |F_A(x) - F_B(x)|$, where $F_A(x)$ and $F_B(x)$ are the empirical distribution functions of
IPDs in A and B. We claim two flows are indistinguishable if their KS distance is below 0.036, a threshold
suggested in [13]. We calculated the average KS distance between watermarked and unwatermarked flows 
using both synthetic and SSH traces. The results are tabulated in Tables VII and VIII. None of the KS
distances exceeds the visibility threshold, which implies that the embedded watermarks did not
cause noticeable artifacts in the original packet flows.
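The KS distance between the IPDs of two flows can be computed directly from the two samples. The sketch below is a plain standard-library illustration, with `ks_distance` as a hypothetical helper name; it evaluates both empirical distribution functions at every observed sample point, where the supremum is attained.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample KS distance: sup_x |F_A(x) - F_B(x)| over empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # bisect_right(s, x) / len(s) is the empirical CDF of s evaluated at x.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )
```

Under the criterion used here, a watermarked flow would be treated as indistinguishable from its unwatermarked counterpart when `ks_distance` of their IPDs is below 0.036.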

TABLE IX

Statistics of blank intervals in the aggregated flow from synthetic traces.

                      Watermarked    Unwatermarked
Mean                  24.07          25.96
Standard Deviation    5.246          5.187

TABLE X

Statistics of blank intervals in the aggregated flow from SSH traces.

                      Watermarked    Unwatermarked
Mean                  403            395.67
Standard Deviation    20.54          16.84

MFA is a watermark attack that detects the positions of embedded watermarks in interval-based schemes.
When flows that were watermarked using the same watermark are aggregated, the aggregate flow shows
a number of intervals containing no packets (see Figure 10 in [11]). To test whether such a 'visible' pattern
exists in flows watermarked using our scheme, we combined 10 watermarked and 10 unwatermarked
flows for both the synthetic and SSH traces, and divided the aggregated flows into intervals of
length 70 ms. We then counted the number of blank intervals with no packets in each aggregate
flow. This procedure was repeated 1000 times, and the resulting blank interval statistics are shown in
Tables IX and X. For both synthetic and SSH traces, we see that the number of blank intervals does not
change much after watermarks are embedded. Figure 9 depicts packet counts in each interval of length
70 ms in the aggregated synthetic traces. Comparing Figure 9(a) with Figure 9(b), no clear watermark pattern
is observed. The same observation holds for Figure 10, which depicts packet counts of SSH traces.
Therefore, our scheme is resistant to MFA.
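The blank-interval count used in this test can be sketched as follows. Aligning all flows at time 0 and ending the observation window at the last packet of the aggregate are assumptions of this sketch, since the text does not state how the window was chosen.

```python
def count_blank_intervals(flows, interval_s=0.07):
    """Count intervals of length interval_s with no packet in the aggregate.

    `flows` is a list of flows, each a list of non-negative packet
    timestamps in seconds. The window runs from 0 to the last packet of the
    aggregate (an assumption of this sketch).
    """
    stamps = [t for flow in flows for t in flow]
    if not stamps:
        return 0
    num_intervals = int(max(stamps) // interval_s) + 1
    occupied = {int(t // interval_s) for t in stamps}
    return num_intervals - len(occupied)
```

Repeating the count over many random groups of 10 flows and taking the mean and standard deviation gives statistics of the kind reported in Tables IX and X.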

To achieve watermark robustness and invisibility simultaneously, we embed a sparse watermark into flow IPDs using
QIM embedding. By modeling the network jitter, deletions, and insertions as a communication
channel described by an HMM and employing an IDS decoder, we can reliably decode the watermark.
Meanwhile, the QIM embedding ensures that the watermark remains invisible to attackers.